Explainable complex model

ABSTRACT

Certain aspects of the present disclosure provide techniques for generating a human-readable summary explanation to a user for an outcome generated by a complex machine learning model. In one embodiment, a risk assessment service can receive a request from a user for a risk model of the risk assessment service to perform a specific task (e.g., determining the level of risk associated with the user). Once the risk model determines the risk associated with the user, in order to comply with regulations from a compliance system, the risk model can provide the user with an explanation of the outcome for transparency purposes.

INTRODUCTION

Aspects of the present disclosure relate to a method and system for generating a summary explanation for an outcome of a complex machine learning model. In particular, embodiments of the present disclosure relate to identifying feature(s) of user data with the greatest impact on the outcome of a complex machine learning model and providing a human-readable explanation to allow the user to better understand the outcome.

The implementation of complex machine learning models for performing tasks on behalf of a user (e.g., making a decision, generating an outcome, etc.) is becoming more and more widespread. Complex machine learning models are trained to take into account thousands of factors and the relationships between such factors when performing a task for a user. As compared to a human user performing the task, a complex machine learning model is able to perform the task in less time and with a higher degree of accuracy. Further, in some cases, it would be impractical for a user to perform such tasks, as they include reviewing thousands of factors and the relationships between such factors.

Despite the growth in implementing complex machine learning models, such implementation is not without constraints. Certain industries are highly regulated, such as finance, pharmaceuticals, accounting, etc. In such regulated industries, an organization or entity is responsible for establishing a set of compliance regulations for privacy, security, transparency, and other purposes. The complex machine learning models implemented in such regulated industries are not exempt from adhering to the compliance regulations.

For example, the Federal Trade Commission (FTC) regulates the financial industry (e.g., credit reporting), and as such, the FTC requires that decisions based on a person's financial information be explained to that person, especially those decisions that negatively impact the person. In one example, under the Fair Credit Reporting Act, if someone submits a loan application and is denied the loan, then the FTC requires that the person be provided an explanation as to why their loan application was denied.

Therefore, a solution is needed to implement complex machine learning models in compliance with the regulations established within an industry.

BRIEF SUMMARY

Certain embodiments provide a method for generating an explanation regarding an outcome of a complex machine learning model (e.g., a risk model). The method generally includes accessing a set of user data from one or more user accounts. The method further includes extracting a set of features from the set of user data corresponding to user risk activity. The method further includes generating, via a risk model, an attribution value for each feature of the set of features. The method further includes generating, based on the set of features, a risk score corresponding to the user activity via the risk model. The method further includes determining the risk score does not meet a pre-determined threshold. The method further includes generating a human-readable explanation indicating a reason that the risk score does not meet the pre-determined threshold, the generating of the human-readable explanation comprising determining, from the set of features, a feature with a highest attribution value and selecting the human-readable explanation based on a mapping of the human-readable explanation to the feature with the highest attribution value.

Other embodiments provide a system configured to perform methods for generating an explanation regarding an outcome of a complex machine learning model, such as the aforementioned method, as well as non-transitory computer-readable storage mediums comprising instructions that, when executed by a processor of a processing system, cause the processing system to perform methods for generating a summary explanation regarding an outcome of a complex machine learning model.

The following description and the related drawings set forth in detail certain illustrative features of one or more embodiments.

BRIEF DESCRIPTION OF THE DRAWINGS

The appended figures depict certain aspects of the one or more embodiments and are therefore not to be considered limiting of the scope of this disclosure.

FIG. 1 depicts an example computing environment for generating an explanation in compliance with a regulatory authority according to an embodiment.

FIG. 2 depicts an example compliance mapping according to an embodiment.

FIG. 3 depicts an example user interface depicting an explanation according to an embodiment.

FIG. 4 depicts an example method for generating an explanation according to an embodiment.

FIG. 5 depicts an example server for generating an explanation according to an embodiment.

To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the drawings. It is contemplated that elements and features of one embodiment may be beneficially incorporated in other embodiments without further recitation.

DETAILED DESCRIPTION

Aspects of the present disclosure provide apparatuses, methods, processing systems, and computer-readable mediums for generating an explanation for an outcome of a complex machine learning model (e.g., a risk model).

To implement a complex machine learning model in a regulated industry, a risk assessment service, upon receiving a request from a user to perform a task (e.g., generate and/or determine an outcome), performs the requested task via a complex machine learning model based on compliance regulation(s) received from a compliance system. The complex machine learning model performs the task and provides the outcome to the user per the compliance regulation by mapping the outcome to a specific regulation (e.g., a compliance code for the explanation to provide to the user).

In one embodiment, a risk assessment service receives a request from a user to perform a task. The request asks the risk assessment service to determine a risk level associated with the user. The risk assessment service performs the task of determining the risk level (e.g., a risk score) via a risk model (e.g., non-linear, linear, etc.). The risk model is a complex machine learning model that is trained to review thousands of factors and corresponding relationships between the factors to generate an outcome based on the review of user data.

To review the factors and corresponding relationships, the risk assessment service retrieves user data, with user authorization, and extracts features from the user data. The extracted features are then input to the risk model. By inputting the features to the risk model, the risk assessment service determines an outcome associated with the user. If the outcome associated with the user corresponds to a particular category, the compliance regulations require the risk assessment service to provide a summary explanation of the outcome to the user. The risk assessment service then generates the summary by mapping the outcome to a compliance regulation that includes, for example, a compliance code and/or a reason. The human-readable explanation is then displayed to the user.

For example, in the financial industry, a user can submit a request to a risk assessment service for a loan via a financial services application. In some cases, the risk assessment service, upon receiving the request for a loan, determines the risk level associated with approving the loan for the user. The risk level can be a risk score generated by a risk model (e.g., non-linear, linear, etc.) that reviews thousands of features and relationships extracted from user data to determine whether the user is predicted to default on the loan if approved. If the risk score is high (or fails to meet a pre-determined threshold), indicating the user is likely to default on the loan, then the risk assessment service can deny the user's request for the loan.

Since the financial services industry is regulated by the Fair Credit Reporting Act, which requires that the user be made aware of the specific reason their loan application was denied, the risk model identifies the feature(s) that had the greatest influence or impact on determining that the user is likely to default on the loan if it were to be approved. In some cases, a Shapley value is calculated by the risk model for each feature, and the feature(s) with the highest values impacting the risk score (and predicted outcome) are identified. Based on the identified feature(s), the risk model maps the features to a compliance regulation from a compliance system. In some cases, the compliance regulation can include a set of codes corresponding to features, and each code can be associated with a human-readable explanation. By mapping the feature to the compliance regulation, the risk model can identify the reason the loan application was denied and provide the reason to the user. In some cases, the explanation provided to the user is standard; in other cases, it is customized by the risk model.

The risk model of the risk assessment service is not limited to the financial services industry. The risk model can be trained and implemented in any number of regulated industries, such as housing, pharmacy, accounting, healthcare, insurance, etc., to provide a human-readable explanation associated with the outcome of the risk model. For example, the risk model can generate an outcome and provide an explanation as to why a user was denied an apartment lease (e.g., history of late rent payments), why a user is prescribed a certain dosage of a medicine (e.g., due to age) or a particular type of medicine (e.g., due to allergies), why a user was denied insurance (e.g., history of car accidents), etc.

Example Computing Environment for Generating an Explanation

FIG. 1 depicts an example computing environment 100 for generating a summary explanation for the outcome of a risk model implemented in a regulated industry. The example computing environment 100 includes a risk assessment service 102, a computing device 104, user database(s) 110, and a compliance system 112.

As depicted, the risk assessment service 102 includes a user interface (UI) module 106 and a risk model 108. The risk assessment service 102 can operate as part of a software program, application, software as a service, etc. The risk assessment service 102 can be implemented in a regulated industry to perform specific task(s) in compliance with regulations from a compliance system 112. For example, the risk assessment service 102 can determine a requested outcome, including the level of risk associated with a user interacting with the service, and provide the user with an explanation of the outcome and level of risk in compliance with regulations received from a compliance system 112.

The UI module 106 of the risk assessment service 102 generates a UI for a computing device 104 (e.g., a smartphone, tablet, desktop, laptop, or other computing device with same or similar capabilities) interacting with the risk assessment service 102. Through the UI generated by the UI module 106, the risk assessment service 102 receives a request from a user to perform a particular task. For example, a risk assessment service 102 implemented in the financial industry can receive a request for a loan via an application to determine whether or not the user is qualified for the loan. In some cases, the request may be for the user submitting the request to the risk assessment service. In other cases, the request may be on behalf of another user. For example, a bank employee (e.g., a third party) can submit the request for the loan via the application on behalf of a bank customer to determine, via the risk assessment service 102, whether to provide the bank customer with the loan.

Upon receiving the request via the UI module 106, the risk assessment service 102 triggers the risk model 108 to perform the task. Continuing the example above, once the risk assessment service 102 receives the loan application, the risk model 108 can generate a risk score for the user. The risk model 108 is a complex machine learning model trained to review a large amount of user data by extracting features and relationships from the user data and generating an outcome and explanation for the user. For example, the risk model 108 can be a non-linear risk model (e.g., XGBoost, Scorecard, GBDT (sklearn), Random Forest, Neural Network, Logistic Regression, etc.), a linear risk model, etc. In some cases, the risk model is an XGBoost non-linear risk model with monotonic constraints.

In order to perform the requested task, the risk assessment service 102 retrieves user data from user database(s) 110. In some cases, the user provides authorization for the risk assessment service 102 to access the user database(s) 110 to retrieve user data. The authorization can include providing the risk assessment service 102 with a user name, password, credentials, or other types of data authorizing the risk assessment service 102 to access the user database(s) 110 on behalf of the user. The user database(s) 110 can include user data pertaining to the user that has been collected by the risk assessment service 102 (or services associated with the risk assessment service 102). The risk assessment service 102 can collect and store in the user database(s) 110 user data that the risk assessment service 102 is permitted by law (or regulations) and by the user to collect.

Once the risk assessment service 102 retrieves the user data, the risk assessment service 102 extracts features from the user data. In some cases, the features are extracted from the user data by transforming the user data into a set of categories by associating each user data in the set of user data with a respective category. The transformation of the user data is based at least in part on knowledge extracted from and/or models trained on previously collected data where associations are established between the previously collected data and the corresponding category.

In the example of the loan application request submitted via an application, the user data can include financial data of the user that is transformed into domain-specific categories (e.g., a spending category, an income category, etc.). The features are extracted from user data based on previously determined domain-specific knowledge and/or a set of models trained on similar data from the same domain knowledge (e.g., finance) that have established associations between domain-specific categories and the user data. By categorizing the user data into domain-specific categories, features are identified for extraction from the user data.
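
For illustration only, a categorization-based extraction step might look like the following minimal Python sketch. The CATEGORY_MAP, transaction fields, and feature names are assumptions, not part of the disclosure; in practice the associations would come from domain knowledge or models trained on previously collected data.

```python
from collections import defaultdict

# Hypothetical map from raw transaction types to domain-specific categories.
CATEGORY_MAP = {
    "grocery": "spending",
    "rent": "spending",
    "payroll": "income",
    "overdraft_fee": "delinquency",
}

def extract_features(transactions):
    """Aggregate raw user transactions into per-category features."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for txn in transactions:
        category = CATEGORY_MAP.get(txn["type"], "other")
        totals[category] += txn["amount"]
        counts[category] += 1
    # Flatten into a single feature dict keyed by category.
    features = {f"{c}_total": v for c, v in totals.items()}
    features.update({f"{c}_count": v for c, v in counts.items()})
    return features

# Example usage with toy data.
txns = [
    {"type": "payroll", "amount": 3000.0},
    {"type": "overdraft_fee", "amount": -35.0},
]
print(extract_features(txns))
```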

The extracted features from the user data are input to the risk model 108. The risk model 108 is a complex machine learning model trained to receive as input thousands of features and relationships between such features. In some cases, the risk model 108 is trained on training data that includes historical user data collected, historical risk scores calculated, and the actual outcomes. As part of training the risk model 108, a Weight of Evidence value is calculated for each feature in the training data to identify whether the feature causes the risk score to increase or decrease. The training data is used to construct a model that estimates the probability of an outcome. Further, the training of the model is constrained based at least in part on the Weight of Evidence value (e.g., a training input's increase or decrease to a risk score matches the direction of increase or decrease indicated by the Weight of Evidence value).
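
As a sketch of how such a directionality constraint could be realized, the following Python snippet estimates a per-feature Weight of Evidence direction and passes it to XGBoost's monotone_constraints parameter. The binning scheme and toy data are assumptions for illustration, not the disclosure's training procedure.

```python
import numpy as np
import xgboost as xgb

def woe_direction(x, y, bins=5):
    """Return +1 if the Weight of Evidence trend for feature x rises with
    the default rate across quantile bins, else -1."""
    edges = np.quantile(x, np.linspace(0, 1, bins + 1))
    idx = np.clip(np.digitize(x, edges[1:-1]), 0, bins - 1)
    total_bad = max((y == 1).sum(), 1)
    total_good = max((y == 0).sum(), 1)
    woes = []
    for b in range(bins):
        bad = max(((idx == b) & (y == 1)).sum(), 1)
        good = max(((idx == b) & (y == 0)).sum(), 1)
        woes.append(np.log((bad / total_bad) / (good / total_good)))
    # Compare first and last bin to get an overall direction.
    return 1 if woes[-1] >= woes[0] else -1

# Toy training data: 2 features, binary default label (1 = default).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = ((X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=500)) > 0).astype(int)

# Constrain each feature's effect on the score to match its WoE direction.
constraints = tuple(woe_direction(X[:, j], y) for j in range(X.shape[1]))
model = xgb.XGBClassifier(n_estimators=50, monotone_constraints=constraints)
model.fit(X, y)
```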

Continuing the example above, the risk model 108 of the risk assessment service 102 reviewing a loan application is trained on previous requests for loan applications, the user data associated with the loans, and the respective associated actual outcomes (e.g., whether the loan applicant actually paid back or defaulted on an approved loan). In some cases, the risk model 108 is trained to identify, based on an actual outcome in which a user defaulted on an approved loan, which feature(s) in the historical user data resulted in the default. The risk model 108 is trained as such so that, when implemented, the risk model 108 can more accurately predict the likelihood of a user defaulting on an approved loan.

With the extracted features input to the risk model 108, the risk model 108 generates a risk score associated with the user. The risk score corresponds to a predicted outcome associated with the user. In some cases, the risk model 108 also generates an attribution value associated with each feature input to the risk model 108 that identifies how much (or to what degree) of the outcome corresponding to the risk score is attributed to the feature. For example, a Shapley value for each feature can be calculated by the risk model 108.
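
A minimal sketch of this attribution step, assuming an XGBoost model and using XGBoost's built-in SHAP-style pred_contribs output; the feature names and toy data are invented for illustration:

```python
import numpy as np
import xgboost as xgb

# Toy setup: train a small model, then attribute one applicant's score.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
y = ((X[:, 0] + 0.5 * X[:, 2]) > 0).astype(int)
names = ["overdraft_count", "income_total", "amount_owed"]

model = xgb.XGBClassifier(n_estimators=30).fit(X, y)

# pred_contribs=True returns one SHAP value per feature plus a bias term,
# so contribs[:-1] aligns with the input features.
contribs = model.get_booster().predict(xgb.DMatrix(X[:1]),
                                       pred_contribs=True)[0]
attributions = dict(zip(names, contribs[:-1]))
top = max(attributions, key=lambda f: abs(attributions[f]))
print(f"feature with highest attribution: {top} ({attributions[top]:+.3f})")
```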

Upon calculating the risk score and generating the attribution value associated with each feature regarding impact on the outcome, the risk model 108 determines whether the predicted outcome that corresponds to the risk score meets a pre-determined threshold. For example, the pre-determined threshold can be a range of values. If the calculated risk score meets the pre-determined threshold (e.g., the risk score falls within the range of values), then the corresponding predicted outcome is positive. In such a case, the positive predicted outcome indicates that the user is not at risk and, in the case of the loan application, is eligible for the loan. If the calculated risk score fails to meet the pre-determined threshold (e.g., the risk score falls outside the range of values), then the corresponding predicted outcome is negative. In such a case, the negative predicted outcome indicates that the user is at risk and, in the case of the loan application, is not eligible for the loan.

In some cases, the pre-determined threshold (e.g., a range of values, a maximum value, etc.) is received from and established by a compliance system 112. For example, the compliance system 112 can provide (and update) compliance regulations indicating which category an outcome is associated with and instructions regarding how to proceed.

In the example of the loan application, if the risk score exceeds the pre-determined threshold (e.g., a maximum value) and the loan application is rejected because the user is predicted to default on the loan, the compliance regulation indicates that the risk model 108 is to determine the reason for rejecting the loan application. In such an example, the risk model 108 determines the reason for rejecting the loan application by reviewing the Shapley values to identify which feature(s) had the highest values, which correspond to the greatest impact on the outcome. Based on the feature(s) with the highest Shapley values, the risk model 108 can determine, by mapping to a compliance regulation (e.g., a code), an explanation as to why the loan application was rejected.
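
A minimal sketch of this threshold check and code lookup; the threshold value, codes, and reasons below are hypothetical stand-ins for what a compliance system would supply:

```python
# Hypothetical feature-to-code and code-to-reason tables; per the disclosure,
# these mappings come from the compliance system, not from the model itself.
FEATURE_TO_CODE = {"overdraft_count": "AA-38", "amount_owed": "AA-01"}
CODE_TO_REASON = {
    "AA-38": "Level of delinquency on accounts",
    "AA-01": "Amount owed on accounts",
}
RISK_THRESHOLD = 0.7  # assumed maximum acceptable risk score

def explain_outcome(risk_score, attributions):
    """Map the top-attribution feature to a compliance code and reason
    when the risk score fails the pre-determined threshold."""
    if risk_score <= RISK_THRESHOLD:
        return None  # positive outcome: no adverse action explanation needed
    top = max(attributions, key=lambda f: attributions[f])
    code = FEATURE_TO_CODE.get(top, "AA-00")
    return code, CODE_TO_REASON.get(code, "Other reason")

print(explain_outcome(0.89, {"overdraft_count": 0.42, "amount_owed": 0.13}))
# -> ('AA-38', 'Level of delinquency on accounts')
```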

In some cases, the risk model 108 can provide the human-readable explanation to the user even if the risk score does meet the pre-determined threshold. The human-readable explanation provides transparency to the user as to how the risk model 108 of the risk assessment service operates. As such, compliance regulations may request that explanations be provided to the user in every instance of assessment by the risk assessment service 102.

Upon determining the mapping of a feature to a compliance regulation, which can include a code associated with an explanation, the risk model 108 provides the explanation to the UI module 106 to generate and display the human-readable explanation to the user (or to third parties, such as the compliance system), in compliance with regulations from the compliance system 112.

In some cases, the risk assessment service 102 (e.g., via the risk model 108) can generate a standard explanation to the user. For example, if the user had too many instances of overdrawing from their account, then the standard explanation determined by the risk model 108 and displayed by the UI module 106 can state: “Due to the number of instances of overdrawing from your account(s), your loan application is denied.”

In other cases, the risk assessment service 102 (e.g., via the risk model 108) can customize the reasoning specific to the user (e.g., identifying a specific account, transaction, etc., that resulted in the loan application being denied). For example, if the user had too many instances of overdrawing from their account, then the customized explanation displayed by the UI module 106 can state: “Due to the 5 instances of overdrawing from your checking account #123456789, on Jan. 7, 2020; Jan. 12, 2020; Jan. 22, 2020; Jan. 25, 2020; and Feb. 20, 2020, your loan application is denied.”
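
As a sketch, the standard and customized explanations could be produced from templates like the following; the helper name and fields are illustrative assumptions, not the disclosure's API:

```python
STANDARD_REASON = ("Due to the number of instances of overdrawing from your "
                   "account(s), your loan application is denied.")

def customized_reason(account: str, dates: list[str]) -> str:
    """Fill a per-user template with the specific account and overdraft dates."""
    return (f"Due to the {len(dates)} instances of overdrawing from your "
            f"checking account #{account}, on {'; '.join(dates)}, "
            f"your loan application is denied.")

print(customized_reason("123456789", ["Jan. 7, 2020", "Jan. 12, 2020"]))
```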

Example Compliance Mapping

FIG. 2 depicts an example diagram 200 of a compliance mapping 202. The compliance mapping 202 maps one or more features 204 extracted from user data to a compliance code 206. The compliance code 206 can refer to a value in a table, list, etc. that is established by the compliance system.

The mapping of the feature 204 to a compliance code 206 is based on compliance regulation(s) generated by a compliance system and provided to the risk assessment service. In some cases, features 204 that are semantically related can map to the same compliance code 206. To determine semantically related features, a risk assessment service can use natural language processing techniques. In some cases, the risk assessment service can calculate similarity scores associated with the features to determine related features (e.g., Jaccard similarity index or coefficient, cosine similarity, etc.).
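
A minimal sketch of one of the similarity measures mentioned above, a token-level Jaccard index between feature names; the cutoff and naming scheme are assumptions:

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over the word tokens of two feature names."""
    sa, sb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(sa & sb) / len(sa | sb)

# Features whose pairwise similarity clears a chosen cutoff (assumed 0.5)
# could be mapped to the same compliance code.
print(jaccard("overdraft_count_checking", "overdraft_count_savings"))  # 0.5
```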

Once the feature(s) 204 are mapped to a compliance code 206, the feature(s) 204 are associated with the compliance reason 208 established by the compliance system. The compliance reason 208 is a human-readable explanation that corresponds to the compliance code 206. In some cases, the compliance reason is a standard explanation that the risk model of the risk assessment service can provide to the user or customize before providing to the user. The compliance reason 208 is determined by the compliance system and provided to the risk assessment service.

For example, in the financial services industry, when a user is applying for a loan, the risk model determines a risk score for the user, predicting whether the user will default on a loan. In order to comply with regulations in the financial industry (e.g., the Fair Credit Reporting Act), the risk model also provides a summary explanation to the user that the user can understand in the event that the user's loan application is denied.

In order to provide an explanation to the user, the risk model calculates a value for how much each feature of the user data impacted the user's risk score. For example, a Shapley value is generated for each feature. Based on the feature(s) with the highest Shapley value, the risk model maps the feature to a compliance code. In the financial services industry, such a code can be an adverse action code that corresponds to the feature(s). The adverse action code can be determined by a compliance system and provided to the risk assessment service.

For example, if the feature with the highest Shapley value is an excessive number of instances of overdrawing from an account in a given period of time, then that feature is mapped to the corresponding adverse action code. In such an example, the adverse action code is associated with an explanation that is provided to the user (e.g., in a generated UI). The explanation can indicate to the user that the reason for denying the loan application is due to “Level of delinquency on accounts.”

Once the risk model maps the feature to a code, the risk model is able to retrieve a compliance reason 208 associated with the code. In some cases, the compliance reason 208 is associated with the compliance code 206 by the compliance system. In some cases, the compliance reason 208 (e.g., a human-readable summary explanation) is mapped to the compliance code 206 based on Weight of Evidence (WoE) and features. With the compliance reason retrieved based on compliance regulation(s), the risk model is able to provide a reason to the user for the outcome generated by the risk model. In some cases, the reason (e.g., human-readable explanation) is a standard explanation. In other cases, the reason is customized by the risk model to provide to the user.

Example User Interface for a Summary Explanation

FIG. 3 depicts an example user interface 300 for displaying a summary explanation of an outcome of a complex machine learning model, as described with respect to FIGS. 1-2. The example user interface 300 is generated by the risk assessment service and provides an explanation, understandable by a user, as to the outcome generated by the risk model of the risk assessment service.

As depicted, the example user interface 300 illustrates the explanation of an outcome of a user requesting a loan through the risk assessment service of a software program, application, software as a service, etc. The example user interface 300 illustrated includes the risk score 302, the reason 304 for the outcome, a re-submission request 306, a button 308 associated with the re-submission request, a request for user feedback 310, and a button 312 associated with the feedback.

The risk score 302 is displayed to the user as “YOUR RISK SCORE IS 89” and includes the value calculated by the risk model of the risk assessment service (“89”). The reason 304 is displayed to the user and includes the outcome of the user's request for a loan and the reason: “Due to the amount owed on your accounts (#12345, #98760), you are predicted to be at risk for defaulting on the requested loan and your loan request is DENIED.”

The reason 304 depicted in the example user interface 300 is customized for the user, providing specific details as to why the loan request was denied. In other cases, a standard reason can be included as to why the loan request was denied. The reason 304 displayed is in compliance with regulations from a compliance system for purposes of transparency to the user.

The example user interface 300 includes a request to the user for re-submitting the request 306 (e.g., “Please re-submit your request after addressing the issues above.”). In some cases, the reason identified for the outcome and included in reason 304 is something the user can correct. As such, the request for re-submission provides the user another opportunity to have the risk assessment service determine whether to approve a loan for the user.

The user can re-submit the request for a loan by selecting button 308. In some cases, the selection of button 308 takes the user to the beginning of the loan application process. In other cases, once the user corrects any deficiencies associated with why the loan was denied, then by selecting button 308, the risk assessment service determines the risk level associated with the user automatically with the corrected information, without the user having to re-enter all of the loan application data.

The example user interface 300 includes a request for user feedback 310 (e.g., “If you believe there is an error, please let us know, or tell us how we can improve.”). The request for user feedback 310 is for the user to provide feedback to the risk assessment service as to how the service is performing.

For example, when providing feedback, the user can select the feedback button 312 and enter feedback indicating that incorrect information was used in determining the risk score, incorrect information was presented to the user, etc., which assists the risk assessment service in training the risk model to avoid such mistakes. The feedback entered can also include information about what the users liked or would like to see as part of the outcome explanation. An authorized entity associated with the risk assessment service can review the feedback and update the risk assessment service accordingly, in compliance with regulations from a compliance system.

Example Method for Generating a Summary Explanation

FIG. 4 depicts an example method 400 for generating a summary explanation for an outcome of a complex machine learning model, as described with respect to FIGS. 1-3.

At step 402, a risk assessment service receives, from a user, a request for a risk score and an authorization to access user data.

At step 404, a risk assessment service accesses, based on the authorization, the user data from one or more user accounts. In some cases, the user data is stored in database(s) associated with the risk assessment service. The user data stored in the database(s) is collected by the risk assessment service (or associated services) when the user is interacting with the software program implementing the risk assessment service. The collection of user data is based on approval by the user and in compliance with laws and/or regulations regarding data collection.

At step 406, a risk assessment service extracts a set of features from the user data corresponding to user risk activity. The user data corresponds to user risk activity and can indicate high risk or low risk associated with the user's activity. In some cases, the risk assessment service extracts the features from the user data by transforming the user data to a corresponding category. By categorizing the user data, features can be identified and extracted for input to a risk model, as described at step 408.

At step 408, a risk assessment service inputs the set of features to a risk model. In some cases, the risk model is a linear model, a non-linear model, etc. In some cases, the risk model is an XGBoost non-linear risk model. In such cases, the XGBoost non-linear risk model can include monotonic constraints.

At step 410, a risk assessment service generates, via the risk model, an attribution value for each feature of the set of features. The attribution value indicates how much (e.g., to what degree) a feature impacts an outcome (e.g., a risk score). In one case, an attribution value is a Shapley value that determines feature attribution in the risk score.

At step 412, a risk assessment service generates, based on the set of features, the risk score (e.g., risk value) corresponding to the user activity via the risk model. In some cases, if the risk score does not meet a pre-determined threshold (e.g., exceeding a threshold value or outside a threshold value range), this indicates a high level of risk associated with the user. The risk score corresponds to a predicted outcome associated with the user.

At step 414, a risk assessment service determines the risk score does not meet a pre-determined threshold. In some cases, a negative predicted outcome is determined if the risk score does not meet the pre-determined threshold. In other cases, a positive predicted outcome is determined if the risk score does meet the pre-determined threshold. For example, in the financial services industry, where a user has submitted a loan application, if the risk score for the user is high and does not meet the pre-established threshold for approving a loan application, then the user's loan application is denied because the user is predicted to default on the loan (e.g., a negative predicted outcome).

At step 416, a risk assessment service generates a human-readable explanation to the user indicating a reason the risk score does not meet the pre-determined threshold. In some cases, the human-readable explanation is based on mapping the features to a code in the compliance regulations that is associated with a human-readable explanation and WoE. In such a case, the explanation is provided to the user, either in the standard format generated or as a custom version of the explanation generated. For example, the feature that has the highest attribution (e.g., Shapley) value is identified. Based on the mapping of the feature with the highest attribution value to a compliance code (e.g., an adverse action code, etc.), an associated explanation is selected by the risk model such that the risk assessment service can generate the human-readable explanation to display to the user.
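
Tying the steps together, a self-contained toy walk-through of method 400 might look like the following; the scoring rule, threshold, and mapping are illustrative stubs, not the disclosure's actual model:

```python
# All names, the toy scoring rule, and the mapping below are assumptions.
THRESHOLD = 0.7
MAPPING = {"overdraft_count": ("AA-38", "Level of delinquency on accounts")}

def method_400(user_data):
    features = {"overdraft_count": user_data["overdrafts"]}    # step 406
    risk_score = min(1.0, 0.15 * features["overdraft_count"])  # step 412 (toy model)
    attributions = {"overdraft_count": risk_score}             # step 410 (toy values)
    if risk_score <= THRESHOLD:                                # step 414
        return risk_score, None                                # positive outcome
    top = max(attributions, key=lambda f: attributions[f])     # step 416
    code, reason = MAPPING[top]
    return risk_score, f"{code}: {reason}"

# Risk score of about 0.9 exceeds 0.7, so an adverse action reason is returned.
print(method_400({"overdrafts": 6}))
```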

Example Server for Generating a Summary Explanation

FIG. 5 depicts an example server 500 that may perform the methods described herein, such as the method to generate a summary explanation associated with an outcome of a complex machine learning model as described with respect to FIGS. 1-4. For example, the server 500 can be a physical server or a virtual (e.g., cloud) server.

Server 500 includes a central processing unit (CPU) 502 connected to a bus 514. CPU 502 is configured to process computer-executable instructions, e.g., stored in memory 510 or storage 512, and to cause the server 500 to perform methods described herein, for example, with respect to FIGS. 1-4. CPU 502 is included to be representative of a single CPU, multiple CPUs, a single CPU having multiple processing cores, and other forms of processing architecture capable of executing computer-executable instructions.

Server 500 further includes input/output (I/O) device(s) 508 and interfaces 504, which allow server 500 to interface with I/O devices 508, such as, for example, keyboards, displays, mouse devices, pen input, and other devices that allow for interaction with server 500. Note that server 500 may connect with external I/O devices through physical and wireless connections (e.g., an external display device).

Server 500 further includes network interface 506, which provides server 500 with access to external network 516 and thereby external computing devices.

Server 500 further includes memory 510, which in this example includes receiving module 518, accessing module 520, extracting module 522, inputting module 524, generating module 526, determining module 528, identifying module 530, and risk model 532 (e.g., a non-linear risk model, linear risk model, etc.) for performing operations described in FIGS. 1-4.

Note that while shown as a single memory 510 in FIG. 5 for simplicity, the various aspects stored in memory 510 may be stored in different physical memories, but all accessible by CPU 502 via internal data connections such as bus 514.

Storage 512 further includes user data 534, which may be like the user data, such as the transaction data, as described in FIGS. 1-4.

Storage 512 further includes feature(s) 536, which may be like the features extracted from the user data, as described in FIGS. 1-4.

Storage 512 further includes authorization data 538, which may be like the authorization data received from a user to access a user's account(s), as described in FIGS. 1-4.

Storage 512 further includes risk score 540, which may be like the risk score calculated by the risk model, as described in FIGS. 1-4.

Storage 512 further includes compliance regulation(s) 542, which may be like the compliance regulations received from a compliance system (e.g., a regulatory authority), as described in FIGS. 1-4.

While not depicted in FIG. 5, other aspects may be included in storage 512.

As with memory 510, a single storage 512 is depicted in FIG. 5 for simplicity, but various aspects stored in storage 512 may be stored in different physical storages, but all accessible to CPU 502 via internal data connections, such as bus 514, or external connections, such as network interface 506. One of skill in the art will appreciate that one or more elements of server 500 may be located remotely and accessed via a network 516.

The preceding description is provided to enable any person skilled in the art to practice the various embodiments described herein. The examples discussed herein are not limiting of the scope, applicability, or embodiments set forth in the claims. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments. For example, changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as appropriate. For instance, the methods described may be performed in an order different from that described, and various steps may be added, omitted, or combined. Also, features described with respect to some examples may be combined in some other examples. For example, an apparatus may be implemented or a method may be practiced using any number of the aspects set forth herein. In addition, the scope of the disclosure is intended to cover such an apparatus or method that is practiced using other structure, functionality, or structure and functionality in addition to, or other than, the various aspects of the disclosure set forth herein. It should be understood that any aspect of the disclosure disclosed herein may be embodied by one or more elements of a claim.

As used herein, a phrase referring to “at least one of” a list of items refers to any combination of those items, including single members. As an example, “at least one of: a, b, or c” is intended to cover a, b, c, a-b, a-c, b-c, and a-b-c, as well as any combination with multiples of the same element (e.g., a-a, a-a-a, a-a-b, a-a-c, a-b-b, a-c-c, b-b, b-b-b, b-b-c, c-c, and c-c-c or any other ordering of a, b, and c).

As used herein, the term “determining” encompasses a wide variety of actions. For example, “determining” may include calculating, computing, processing, deriving, investigating, looking up (e.g., looking up in a table, a database or another data structure), ascertaining and the like. Also, “determining” may include receiving (e.g., receiving information), accessing (e.g., accessing data in a memory) and the like. Also, “determining” may include resolving, selecting, choosing, establishing and the like.

The methods disclosed herein comprise one or more steps or actions for achieving the methods. The method steps and/or actions may be interchanged with one another without departing from the scope of the claims. In other words, unless a specific order of steps or actions is specified, the order and/or use of specific steps and/or actions may be modified without departing from the scope of the claims. Further, the various operations of methods described above may be performed by any suitable means capable of performing the corresponding functions. The means may include various hardware and/or software component(s) and/or module(s), including, but not limited to a circuit, an application specific integrated circuit (ASIC), or processor. Generally, where there are operations illustrated in figures, those operations may have corresponding counterpart means-plus-function components with similar numbering.

The various illustrative logical blocks, modules and circuits described in connection with the present disclosure may be implemented or performed with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device (PLD), discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any commercially available processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.

A processing system may be implemented with a bus architecture. The bus may include any number of interconnecting buses and bridges depending on the specific application of the processing system and the overall design constraints. The bus may link together various circuits including a processor, machine-readable media, and input/output devices, among others. A user interface (e.g., keypad, display, mouse, joystick, etc.) may also be connected to the bus. The bus may also link various other circuits such as timing sources, peripherals, voltage regulators, power management circuits, and other circuit elements that are well known in the art, and therefore, will not be described any further. The processor may be implemented with one or more general-purpose and/or special-purpose processors. Examples include microprocessors, microcontrollers, DSP processors, and other circuitry that can execute software. Those skilled in the art will recognize how best to implement the described functionality for the processing system depending on the particular application and the overall design constraints imposed on the overall system.

If implemented in software, the functions may be stored or transmitted over as one or more instructions or code on a computer-readable medium. Software shall be construed broadly to mean instructions, data, or any combination thereof, whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise. Computer-readable media include both computer storage media and communication media, such as any medium that facilitates transfer of a computer program from one place to another. The processor may be responsible for managing the bus and general processing, including the execution of software modules stored on the computer-readable storage media. A computer-readable storage medium may be coupled to a processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. By way of example, the computer-readable media may include a transmission line, a carrier wave modulated by data, and/or a computer readable storage medium with instructions stored thereon separate from the wireless node, all of which may be accessed by the processor through the bus interface. Alternatively, or in addition, the computer-readable media, or any portion thereof, may be integrated into the processor, such as the case may be with cache and/or general register files. Examples of machine-readable storage media may include, by way of example, RAM (Random Access Memory), flash memory, ROM (Read Only Memory), PROM (Programmable Read-Only Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), registers, magnetic disks, optical disks, hard drives, or any other suitable storage medium, or any combination thereof. The machine-readable media may be embodied in a computer-program product.

A software module may comprise a single instruction, or many instructions, and may be distributed over several different code segments, among different programs, and across multiple storage media. The computer-readable media may comprise a number of software modules. The software modules include instructions that, when executed by an apparatus such as a processor, cause the processing system to perform various functions. The software modules may include a transmission module and a receiving module. Each software module may reside in a single storage device or be distributed across multiple storage devices. By way of example, a software module may be loaded into RAM from a hard drive when a triggering event occurs. During execution of the software module, the processor may load some of the instructions into cache to increase access speed. One or more cache lines may then be loaded into a general register file for execution by the processor. When referring to the functionality of a software module, it will be understood that such functionality is implemented by the processor when executing instructions from that software module.

The following claims are not intended to be limited to the embodiments shown herein, but are to be accorded the full scope consistent with the language of the claims. Within a claim, reference to an element in the singular is not intended to mean “one and only one” unless specifically so stated, but rather “one or more.” Unless specifically stated otherwise, the term “some” refers to one or more. No claim element is to be construed under the provisions of 35 U.S.C. § 112(f) unless the element is expressly recited using the phrase “means for” or, in the case of a method claim, the element is recited using the phrase “step for.” All structural and functional equivalents to the elements of the various aspects described throughout this disclosure that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the claims. Moreover, nothing disclosed herein is intended to be dedicated to the public regardless of whether such disclosure is explicitly recited in the claims.

What is claimed is:
1. A computer-implemented method, comprising: receiving a compliance regulation from a compliance system, wherein the compliance regulation includes a mapping of at least one feature from a set of features to at least one human-readable explanation; accessing a set of user data from one or more user accounts; extracting the set of features from the set of user data corresponding to user risk activity; generating, via a risk model, an attribution value for each feature of the set of features, wherein: the risk model comprises a probabilistic model trained on training data including a Weight of Evidence value calculated for each feature of the set of features, the Weight of Evidence value calculated for each feature corresponds to an indication of whether each feature of the set of features causes an increase or a decrease to risk scores generated by the risk model, and a directionality of changes to risk scores generated by the risk model is constrained by the Weight of Evidence value calculated for each feature of the set of features; generating, based on the set of features, a risk score corresponding to the user activity via the risk model; determining the risk score does not meet a pre-determined threshold; and generating a human-readable explanation indicating a reason that the risk score does not meet the pre-determined threshold, the generating of the human-readable explanation comprising: determining, from the set of features, a feature with a highest attribution value; and selecting the human-readable explanation based on a mapping of the human-readable explanation to the feature with the highest attribution value.
2. The computer-implemented method of claim 1, wherein the risk score corresponds to a predicted outcome associated with the user.
3. The computer-implemented method of claim 2, wherein the predicted outcome is one of: a negative predicted outcome if the risk score does not meet the pre-determined threshold; or a positive predicted outcome if the risk score meets the pre-determined threshold.
4. The computer-implemented method of claim 1, wherein the risk model is trained with: historical user data from a set of users; historical risk scores for the set of users; and historical actual outcomes associated with the set of users.

5. The computer-implemented method of claim 1, wherein the risk model is a XGBoost non-linear risk model.
6. The computer-implemented method of claim 5, wherein the XGBoost non-linear risk model includes monotonic constraints.
7. The computer-implemented method of claim 1, further comprising: receiving feedback from the user based on the human-readable explanation; and including the feedback in training the risk model.

8. The computer-implemented method of claim 1, wherein the extraction of the set of features further comprises transforming the set of user data into a set of categories based on associating each user data in the set of user data with a respective category.
9. A system, comprising: a memory having executable instructions stored thereon; and a processor configured to execute the executable instructions in order to cause the system to: receive a compliance regulation from a compliance system, wherein the compliance regulation includes a mapping of at least one feature from a set of features to at least one human-readable explanation; access a set of user data from one or more user accounts; extract the set of features from the set of user data corresponding to user risk activity; generate, via a risk model, an attribution value for each feature of the set of features, wherein: the risk model comprises a probabilistic model trained on training data including a Weight of Evidence value calculated for each feature of the set of features, the Weight of Evidence value calculated for each feature corresponds to an indication of whether each feature of the set of features causes an increase or a decrease to risk scores generated by the risk model, and a directionality of changes to risk scores generated by the risk model is constrained by the Weight of Evidence value calculated for each feature of the set of features; generate, based on the set of features, a risk score corresponding to the user activity via the risk model; determine the risk score does not meet a pre-determined threshold; and generate a human-readable explanation indicating a reason that the risk score does not meet the pre-determined threshold, wherein in order to generate the human-readable explanation, the processor is configured to cause the system to: determine, from the set of features, a feature with a highest attribution value; and select the human-readable explanation based on a mapping of the human-readable explanation to the feature with the highest attribution value.
10. The system of claim 9, wherein the risk score corresponds to a predicted outcome associated with the user.

11. The system of claim 10, wherein the predicted outcome is one of: a negative predicted outcome if the risk score does not meet the pre-determined threshold; or a positive predicted outcome if the risk score meets the pre-determined threshold.
12. The system of claim 9, wherein the risk model is trained with: historical user data from a set of users; historical risk scores for the set of users; and historical actual outcomes associated with the set of users.
13. The system of claim 9, wherein the risk model is a XGBoost non-linear risk model.

14. The system of claim 13, wherein the XGBoost non-linear risk model includes monotonic constraints.
15. The system of claim 9, wherein the processor is further configured to cause the system to: receive feedback from the user based on the human-readable explanation; and include the feedback in training the risk model.
16. The system of claim 9, wherein the transformation of the set of user data into the set of categories is based on associating each user data in the set of user data with a respective category.